- Design Details
- --------------
-
- This section goes into a few of the more obscure details not covered in the
- section on security analysis, such as the encryption algorithm used by SFS, the
- generation of random numbers, the handling of initialization vectors (IV's),
- and a brief overview of the deletion of sensitive information retained in
- memory after a program has terminated (this is covered in more detail in the
- section "Security Analysis" above).
-
-
- The Encryption Algorithm used in SFS
-
- Great care must be taken when choosing an encryption algorithm for use in
- security software. For example, the standard Unix crypt(1) command is based on
- a software implementation of a rotor machine encryption device of the kind
- which was broken by mathematicians using pencil and paper (and, later on, some
- of the first electronic computers) in the late 1930's[1]. Indeed, there exists
- a program called `crypt breaker's workbench' which allows the automated
- breaking of data encrypted using the crypt(1) command[2]. The insecurity of
- various other programs has been mentioned elsewhere. It is therefore
- imperative that a reliable encryption system, based on world-wide security
- standards, and easily verifiable by consulting these standards, be used.
-
- When a block cipher is used as a stream cipher by running it in CFB (cipher
- feedback) mode, there is no need for the cipher's block transformation to be a
- reversible one as it is only ever run in one direction (generally
- encrypt-only). Therefore the use of a reversible cipher such as DES or IDEA is
- unnecessary, and any secure one-way block transformation can be substituted.
- This fact allows the use of one-way hash functions, which have much larger
- block sizes (128 or more bits) and key spaces (512 or more bits) than most
- reversible block ciphers in use today.
-
- The transformation involved in a one-way hash function takes an initial hash
- value H and a data block D, and hashes them to create a new hash value H':
-
- hash( H, D ) -> H'
-
- or, more specifically, in the function used in SFS:
-
- H + hash( D ) -> H'
-
- This operation is explained in more detail in FIPS Publication 180 and ANSI
- X9.30 part 2, which define the Secure Hash Standard. By using H as the data
- block to be encrypted and D as the key, we can make the output value H'
- dependent on a user-supplied key. That is, when H is the plaintext, D is the
- encryption key, and H' is the ciphertext:
-
- plaintext H
- |
- v
- +---------+
- | SHS |<- key D
- +---------+
- |
- v
- ciphertext H'
-
- If we regard it as a block cipher, the above becomes:
-
- H' = SHS( H )
-
- which is actually:
-
- C = e( P )
-
- Since we can only ever "encrypt" using a one-way hash function, we need to run
- the "cipher" in cipher feedback mode, which doesn't require a reversible
- encryption algorithm.
-
- By the properties of the hash function, it is computationally infeasible to
- either recover the key D or to control the transformation H -> H' (in other
- words given a value for H' we cannot predict the H which generated it, and
- given control over the value H we cannot generate an arbitrary H' from it).
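-
- As a concrete illustration, the following sketch shows the shape of this
- one-way block transformation (using 32-bit toy values rather than 160-bit
- ones): the "plaintext" H enters as the hash state, the key D enters as the
- data block, and H' emerges. The transform() routine is a toy stand-in for
- hash( D ), not the real SHS compression function.
-
-    #include <stdio.h>
-    #include <stdint.h>
-
-    /* Toy stand-in for hash( D ); any one-way function of D works here */
-    static uint32_t transform( uint32_t d )
-        {
-        uint32_t h = d * 2654435761UL;      /* toy mixing only */
-
-        h ^= h >> 15;
-        return( h * 2246822519UL );
-        }
-
-    /* H' = H + hash( D ): "encrypt" plaintext H under key D */
-    static uint32_t encryptBlock( uint32_t H, uint32_t D )
-        {
-        return( H + transform( D ) );
-        }
-
-    int main( void )
-        {
-        uint32_t H = 0x01234567UL;          /* plaintext */
-        uint32_t D = 0x89ABCDEFUL;          /* key */
-
-        printf( "H' = %08lX\n", ( unsigned long ) encryptBlock( H, D ) );
-        return( 0 );
-        }
-
- With a real one-way hash function in place of the toy transform, recovering
- D from an H/H' pair, or choosing an H which produces a given H', is
- infeasible by the properties listed above.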
-
- The MDC encryption algorithm is a general encryption system which will take any
- one-way hash function and turn it into a stream cipher running in CFB mode. The
- recommended one-way hash function for MDC is the Secure Hash Standard as
- specified in Federal Information Processing Standards (FIPS) Publication 180
- and ANSI X9.30 part 2. SHS is used as the block transformation in a block
- cipher run in CFB mode as detailed in AS 2805.5.2 section 8 and ISO 10116:1991
- section 6, with the two parameters (the size of the feedback and plaintext
- variables) j and k both being set to the SHS block size of 160 bits. The
- properties of this mode of operation are given in Appendix A3 of AS 2805.5.2
- and Annex A.3 of ISO 10116:1991. The CFB mode of operation is also detailed in
- a number of other standards such as FIPS Publication 81 and USSR Government
- Standard GOST 28147-89, Section 4. The use of an initialization vector (IV) is
- as given in ISO 10126-2:1991 section 4.2, except that the size of the IV is
- increased to 160 bits from the 48 or 64 bits value given in the standard. This
- is again detailed in a number of other standards such as GOST 28147-89 Section
- 3.1.2. The derivation of the IV is given in the section "Encryption
- Considerations" below.
-
- The key setup for the MDC encryption algorithm is performed by running the
- cipher over the encryption key (in effect encrypting the key with MDC using
- itself as the key) and using the encrypted value as the new encryption key.
- This procedure is then repeated a number of times to make a "brute-force"
- decryption attack more difficult, as per the recommendation in the Public-Key
- Cryptography Standard (PKCS), part 1. This reduces any input key, even one
- which contains regular data patterns, to a block of white noise equal in size
- to the MDC key data.
-
- The exact key scheduling process for MDC is as follows:
-
- Initialization:
-
- - The SHS hash value H is set to the key IV[3].
- - The SHS data block D is set to all zeroes.
- - The key data of length 2048 bits is set to a 16-bit big-endian value
- containing the length of the user key in bytes, followed by up to 2032 bits
- of user key.
-
- SHS hash value H = key IV;
- SHS data block D = zeroes;
- key_data [0:15] = length of user key in bytes;
- key_data [16:2047] = user key, zero-padded;
-
- Key schedule:
-
- - The following process is iterated a number of times:
-
- - The 2048-bit key data block is encrypted using MDC.
- - Enough of the encrypted key data is copied from the start of the key data
- block into the SHS data block D to fill it.
-
- for i = 1 to 200 do
-     key_data = encrypt( key_data );
-     D = key_data[ 0:511 ];
-
- During the repeated encryptions, the IV is never reset. This means that the IV
- from the end of the (n-1)th data block is re-injected into the start of the nth
- data block. After 200 iterations, the "randomness" present in the key has been
- diffused throughout the entire key data block.
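-
- The following C sketch restates this key schedule. The sizes follow the text
- (160-bit IV, 512-bit SHS data block D, 2048-bit key data block), but the
- mdcEncryptCFB() routine is a deliberately trivial stand-in for MDC/SHS in
- CFB mode, present only so that the sketch is self-contained.
-
-    #include <string.h>
-    #include <stdint.h>
-    #include <stddef.h>
-
-    #define IV_SIZE         20      /* 160-bit CFB feedback value */
-    #define D_SIZE          64      /* 512-bit SHS data block D */
-    #define KEYDATA_SIZE    256     /* 2048-bit key data block */
-    #define ITERATIONS      200
-
-    /* Toy stand-in for MDC/SHS in CFB mode: encrypts the buffer in place,
-       keyed by D, carrying the IV across calls (it is never reset) */
-    static void mdcEncryptCFB( uint8_t *iv, const uint8_t *D,
-                               uint8_t *buffer, size_t length )
-        {
-        size_t i;
-
-        for( i = 0; i < length; i++ )
-            {
-            uint8_t k = ( uint8_t ) ( iv[ i % IV_SIZE ] * 167 +
-                                      D[ i % D_SIZE ] * 83 + 29 );
-
-            buffer[ i ] ^= k;                   /* C = P ^ e( feedback ) */
-            iv[ i % IV_SIZE ] = buffer[ i ];    /* ciphertext feeds back */
-            }
-        }
-
-    static void mdcKeySchedule( const uint8_t *userKey, size_t userKeyLength,
-                                const uint8_t *keyIV, uint8_t *D )
-        {
-        uint8_t iv[ IV_SIZE ], keyData[ KEYDATA_SIZE ] = { 0 };
-        int i;
-
-        /* H = key IV; D = zeroes */
-        memcpy( iv, keyIV, IV_SIZE );
-        memset( D, 0, D_SIZE );
-
-        /* key_data = 16-bit big-endian length || user key, zero-padded */
-        keyData[ 0 ] = ( uint8_t ) ( userKeyLength >> 8 );
-        keyData[ 1 ] = ( uint8_t ) userKeyLength;
-        memcpy( keyData + 2, userKey, userKeyLength );  /* <= 254 bytes */
-
-        /* Iterate: encrypt the key data, rekey from its first 512 bits.
-           The IV is carried from one iteration to the next, never reset */
-        for( i = 0; i < ITERATIONS; i++ )
-            {
-            mdcEncryptCFB( iv, D, keyData, KEYDATA_SIZE );
-            memcpy( D, keyData, D_SIZE );
-            }
-        }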
-
- Although the full length of the key data block is 2048 bits, the SHS algorithm
- only uses 512 bits of this (corresponding to the SHS data block D) per
- iteration. The remaining 1536 bits take part in the computation (by being
- carried along via the IV) but are not used directly. By current estimates
- there are around 2^256 atoms in the universe. Compiling a table of all 2^512
- possible keys which would be necessary for a brute-force attack on MDC would
- therefore be a considerable challenge to an attacker, requiring, at least, the
- creation of another 512 * 2^256 universes to hold all the keys. Even allowing
- for the current best-case estimate of a creation time of 7 days per universe,
- the mere creation of storage space for all the keys would take an unimaginably
- large amount of time.
-
- The SFS key schedule operation has been deliberately designed to slow down
- special hardware implementations, since the encryption algorithm is rekeyed
- after each iteration. Normal high-speed password-cracking hardware would (for
- example, with DES) have 16 separate key schedules in a circular buffer, each
- being applied to a different stage of a 16-stage pipeline (one stage per DES
- round) allowing a new result to be obtained in every clock cycle once the
- pipeline is filled. In MDC the key data is reused multiple times during the 80
- rounds of SHS, requiring 80 separate key schedules for the same performance as
- the 16 DES ones. However since the algorithm is rekeyed after every iteration
- for a total of 200 iterations, this process must either be repeated 200 times
- (for a corresponding slowdown factor of 200), or the full pipeline must be
- extended to 16,000 stages to allow the one-result-per-cycle performance which
- the 16-stage DES pipeline can deliver (assuming the rekeying operation can be
- performed in a single cycle - in fact the 1024-bit keys used by MDC/SHS need to
- be processed in 7 sequential 160-bit blocks due to SHS's block size, stretching
- the pipeline to 112,000 stages). Changing the iteration count to a higher
- value will further slow down this process.
-
- The number of iterations of key encryption is controlled by the user, and is
- generally done some hundreds of times. The setup process in SFS has been tuned
- to take approximately half a second on a workstation rated at around 15 MIPS
- (corresponding to 200 iterations of the encryption process), making a
- brute-force password attack very time-consuming. Note that the key IV is
- injected at the earliest possible moment in the key schedule rather than at the
- very end, making the use of a precomputed data attack impossible. The standard
- method of injecting the encryption IV at the end of the key schedule process
- offers very little protection against an attack using precomputed data, as it
- is still possible to precompute the key schedules and simply drop in the
- encryption IV at the last possible moment.
-
- Footnote [1]: This is covered in a number of books, for example Welchman's "The
- Hut Six Story: Breaking the Enigma Codes", New York, McGraw-Hill
- 1982, and Kahn's "Seizing the Enigma", Boston, Houghton-Mifflin
- 1991.
-
- Footnote [2]: Available from ftp.ox.ac.uk in the directory /src/security as
- cbw.tar.Z.
-
- Footnote [3]: Some sources would refer to this value as a `salt'. The term
- `key IV' is used here as this is probably a more accurate
- description of its function.
-
-
- Generating Random Numbers
-
- One thing which cryptosystems consume in large quantities is random numbers.
- Not just any old random value, but cryptographically strong random numbers. A
- cryptographically strong random value is one which cannot be predicted by an
- attacker (if the attacker can predict the values which are used to set up
- encryption keys, then they can make a guess at the encryption key itself).
- This automatically rules out all software means of generating random values,
- and means specialised hardware must be used.
-
- Very few PC's are currently equipped with this type of hardware. However SFS
- requires 1024 random bits for each encrypted disk, in the form of the disk key
- (see the section "Password Lifetimes and Scope" above). SFS therefore uses a
- number of sources of random numbers, most of them present in the hardware of
- the PC, along with one external source:
-
- - Various hardware timers which are read occasionally when the program is
- running (generally after operations which are long and complex and will be
- heavily affected by external influences such as interrupts, video and
- screen updates, disk I/O, and other factors).
-
- - The contents and status information of the keyboard buffer
-
- - Disk driver controller and status information
-
- - Mouse data and information
-
- - Video controller registers and status information
-
- [ - The clock skew between two hardware clocks available on the PC. Due to
- background system activity such as interrupt servicing, disk activity, and
- variable-length instruction execution times, these clocks run out-of-phase.
- SFS uses this phase difference as a source of random numbers. NB: Not
- implemented yet].
-
- - The timing of keystrokes when the password is entered. SFS reads the
- high-speed 1.19 MHz hardware timer after each keystroke and uses the timer
- values as a source of random numbers. This timer is both used to measure
- keystroke latency when the password is entered and read at random times
- during program execution. Trials have shown that this 16-bit counter
- yields around 8 bits of random information (the exact information content
- is difficult to gauge as background interrupts, video updates, disk
- operations, protected-mode supervisor software, and other factors greatly
- affect any accesses to this counter, but an estimate of 8 bits is probably
- close enough[1]).
-
- - The timing of disk access latency for random disk reads. The exact
- operation is as follows:
-
- Read a timer-based random disk sector
- Add its contents (8 bits)
- Read the high-speed 1.19 MHz hardware timer (13 bits)
- Use the two values for the next random sector
-
- This is repeated as often as required (in the case of SFS this is 10
- times; a sketch of this loop is given after this list). Assuming a
- (currently rather optimistic) maximum of 5ms to acquire a sector, this
- provides about 13 bits of randomness per disk operation. The
- number of factors which influence this value is rather high, and includes
- among other things the time it takes the BIOS to process the request, the
- time it takes the controller to process the request, time to seek to the
- track on the disk, time to read the data (or, if disk caching is used,
- time for the occasional cache hit), time to send it to the PC, time to
- switch in and out of protected mode when putting it in the cache, and of
- course the constant 3-degree background radiation of other interrupts and
- system activity happening at the same time. If a solid-state disk were
- being used, the hardware latency would be significantly reduced, but
- currently virtually no 386-class PC's have solid-state disks (they're
- reserved for palmtops and the like), so this isn't a major concern.
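-
- The disk-timing loop referred to above can be sketched as follows. The
- readSector() and readTimer() routines are hypothetical stand-ins for the
- BIOS disk read and the 1.19 MHz hardware timer, and the pool update is
- simplified to a plain XOR; the point is the feedback of timer and sector
- data into the choice of the next sector to read.
-
-    #include <stdint.h>
-
-    #define SECTOR_SIZE     512
-
-    extern void readSector( uint32_t sector, uint8_t *buffer );
-    extern uint16_t readTimer( void );      /* 1.19 MHz counter */
-    extern uint32_t noSectors;              /* sectors on the drive */
-
-    static void addDiskRandomness( uint8_t *pool, int poolSize )
-        {
-        uint8_t buffer[ SECTOR_SIZE ];
-        uint32_t sector = readTimer() % noSectors, sum;
-        int i, j, poolIndex = 0;
-
-        for( i = 0; i < 10; i++ )
-            {
-            readSector( sector, buffer );   /* the latency is the point */
-            for( sum = 0, j = 0; j < SECTOR_SIZE; j++ )
-                sum += buffer[ j ];         /* add the sector contents */
-            sum ^= readTimer();             /* ~13 bits of latency */
-            pool[ poolIndex++ % poolSize ] ^= ( uint8_t ) sum;
-            pool[ poolIndex++ % poolSize ] ^= ( uint8_t ) ( sum >> 8 );
-            sector = sum % noSectors;       /* next "random" sector */
-            }
-        }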
-
- An estimate of the number of random bits available from each source is as
- follows:
-
- Keystroke latency, 8 bits per key 80 bits for minimum 10-char key
- Second password entry for encryption 80 bits for minimum 10-char key
- Disk access latency, 13 bits per read 130 bits for 10 reads
- Disk sector data, 8 bits 80 bits for 10 reads
- System clocks and timers 3 bits
- Video controller information 4 bits
- Keyboard buffer information 4 bits
- Disk status information 4 bits
- General system status 4 bits
- Random high-speed timer reads 120 bits for 15 reads
- --------
- Total 509 bits
-
- These figures are very conservative estimates only, and are based on timing
- experiments with typed-in passwords and a careful scrutiny of the PC's hardware
- and system status data. For example, although the time to access a disk sector
- for a particular drive may be 10ms or more, the actual variation on that 10ms
- may only be +/- 2ms. The figures given above were taken by averaging the
- variation in times for large numbers of tests. In practice (especially with
- longer passwords) the number of random bits is increased somewhat (for example
- with a 30-character password the total may be as high as 829 bits of random
- information). However even the minimal estimate of 509 bits is adequate for
- the 512-bit key required by MDC.
-
- Each quantum of semi-random information is exclusive-ored into a 1024-bit
- buffer which is initially set to all zeroes. Once 1024 bits of buffer have
- been filled, the data is encrypted with MDC to distribute the information, and
- the first 512 bits of the 1024-bit buffer are used as the key for the next MDC
- encryption pass. Then more data is added until, again, 1024 bits of buffer
- have been filled, whereupon the data is again mixed by encrypting it with MDC.
- This process is repeated several times, with the amount of "randomness" in the
- buffer increasing with each iteration.
-
- Before being used, this data is encrypted 10 times over with MDC to ensure a
- complete diffusion of randomness. Since the IV for the encryption is reused
- for the next pass through the buffer, any information from the end of the
- buffer is thus reinjected at the start of the buffer on the next encryption
- pass.
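-
- This accumulate-and-mix cycle can be sketched as follows, assuming the same
- toy mdcEncryptCFB() stand-in used in the key schedule sketch above; as
- before, the real code would use MDC/SHS, and the IV is carried from one
- encryption pass to the next rather than being reset.
-
-    #include <string.h>
-    #include <stdint.h>
-    #include <stddef.h>
-
-    #define POOL_SIZE   128     /* 1024-bit accumulator */
-    #define KEY_SIZE    64      /* 512-bit MDC key */
-    #define IV_SIZE     20      /* 160-bit IV */
-
-    extern void mdcEncryptCFB( uint8_t *iv, const uint8_t *key,
-                               uint8_t *buffer, size_t length );
-
-    static uint8_t pool[ POOL_SIZE ], key[ KEY_SIZE ], iv[ IV_SIZE ];
-    static int poolPos = 0;
-
-    /* XOR a quantum of semi-random data into the pool; when the pool
-       fills, mix it with MDC and rekey from its first 512 bits */
-    static void addToPool( const uint8_t *data, int length )
-        {
-        int i;
-
-        for( i = 0; i < length; i++ )
-            {
-            pool[ poolPos++ ] ^= data[ i ];
-            if( poolPos == POOL_SIZE )
-                {
-                mdcEncryptCFB( iv, key, pool, POOL_SIZE );
-                memcpy( key, pool, KEY_SIZE );
-                poolPos = 0;
-                }
-            }
-        }
-
-    /* Final whitening: encrypt the pool ten times over, the carried-over
-       IV reinjecting the end of the buffer into the start on each pass */
-    static void mixPoolFinal( void )
-        {
-        int i;
-
-        for( i = 0; i < 10; i++ )
-            mdcEncryptCFB( iv, key, pool, POOL_SIZE );
-        }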
-
- Although this method of generating random numbers is not perfect, it seems to
- be the best available using the existing hardware. General estimates of the
- exact amount of truly random information which can be acquired in this manner
- are in the vicinity of several hundred bits. Proposed attacks all make the
- assumption that an attacker is in possession of what amounts to a complete
- hardware trace of events on the machine in question. Allowing for a reasonable
- amount of physical system security, it can be assumed that the random data used
- in SFS is unpredictable enough to provide an adequate amount of security
- against all but the most determined attacker.
-
- Footnote [1]: If an opponent can obtain several hours of keystroke timings and
- can come up with a good model including serial correlations, they
- may be able to reduce the likely inputs to the random number
- accumulator to a somewhat smaller value, or at least bias their
- guesses to fall within the range of likely values.
-
-
- Encryption Considerations
-
- When a block cipher is converted to handle units of data larger than its
- intrinsic block size, a number of weaknesses can be introduced, depending on
- the mode of operation which is chosen for the block cipher. For example, if
- two identical ciphertext blocks are present in different locations in a file,
- this may be used to determine the plaintext. If we can find two identical
- blocks of ciphertext when cipher block chaining (CBC) is used, then we know
- that:
-
- P[ i ] = d( C[ i ] ) ^ C[ i-1 ]
- P[ j ] = d( C[ j ] ) ^ C[ j-1 ]
-
- where C is the ciphertext, P is the plaintext, and e() and d() are encryption
- and decryption respectively. Now if C[ i ] = C[ j ], then d( C[ i ] ) =
- d( C[ j ] ), which cancel out when xor'd so that:
-
- P[ i ] ^ C[ i-1 ] = P[ j ] ^ C[ j-1 ]
-
- or:
-
- P[ j ] = P[ i ] ^ C[ i-1 ] ^ C[ j-1 ]
-
- Knowing C[ i-1 ] and C[ j-1 ] we can determine the XOR of P[ i ] and P[ j ],
- and knowing either P[ i ] or P[ j ] we can determine the other.
-
- Something similar holds when cipher feedback (CFB) mode is used, except that
- now the decryption operation is:
-
- P[ i ] = e( C[ i-1 ] ) ^ C[ i ]
- P[ j ] = e( C[ j-1 ] ) ^ C[ j ]
-
- Now if C[ i-1 ] = C[ j-1 ] then e( C[ i-1 ] ) = e( C[ j-1 ] ) (recall that in
- CFB mode the block cipher is only ever used for encryption), so that they
- again cancel out, so:
-
- P[ i ] ^ C[ i ] = P[ j ] ^ C[ j ]
-
- or:
-
- P[ j ] = P[ i ] ^ C[ i ] ^ C[ j ]
-
- In general this problem is of little consequence since the probability of
- finding two equal blocks of ciphertext when using a 160-bit block cipher on a
- dataset of any practical size is negligible. More esoteric modes of operation
- such as plaintext feedback (PFB) and ones whose acronyms have more letters than
- Welsh place names tend to have their own special problems and aren't considered
- here.
-
- The problem does become serious, however, in the case of sector-level
- encryption, where the initialization vector cannot be varied. Although the IV
- may be unique for each sector, it remains constant unless special measures are
- taken, such as reserving extra storage for sector IV's which are updated with
- each sector write. If a sector is read from disk, a small change made to part of
- it (for example changing a word in a text file), and the sector written out to
- disk again, several unchanged ciphertext/plaintext pairs will be present,
- allowing the above attack to be applied. However, there are cases in which
- this can be a problem. For example, running a program such as a disk
- defragmenter will rewrite a large number of sectors while leaving the IV
- unchanged, allowing an opponent access to large quantities of XOR'd plaintext
- blocks simply by recording the disk contents before and after the
- defragmentation process. Normally this problem would be avoided by using a
- different IV for each encrypted message, but most disk systems don't have the
- room to store an entire sector's worth of data as well as the IV needed to
- en/decrypt it.
-
- An additional disadvantage of the CFB encryption mode is that the data in the
- last block of a dataset may be altered by an attacker to give different
- plaintext without it affecting the rest of the block, since the altered
- ciphertext in the last block never enters the feedback loop. This type of
- attack requires that an opponent possess at least two copies of the ciphertext,
- and that they differ only in the contents of the last block. In this case the
- last ciphertext block from one copy can be substituted for the last ciphertext
- block in the other copy, allowing a subtle form of message modification attack.
- In fact in combination with the previously mentioned weakness of CFB, an
- attacker can determine the XOR of the plaintexts in the last block and
- substitute an arbitrary piece of "encrypted" plaintext to replace the existing
- data.
-
- There are several approaches to tackling this problem. The simplest one
- is to permute the plaintext in a key-dependent manner before encryption and
- after decryption. This solution is unsatisfactory as it simply shuffles the
- data around without necessarily affecting any particular plaintext or
- ciphertext block. The desired goal of a change in any part of the plaintext
- affecting the entire dataset is not achieved.
-
- A better solution is to encrypt data twice, once from front to back and then
- from back to front[1]. The front-to-back pass propagates any dependencies to
- the back of the dataset, and the back-to-front pass propagates dependencies
- back to the front again. In this way a single change in any part of the
- plaintext affects the entire dataset. The disadvantage of this approach is
- that it at least halves the speed of the encryption, as all data must be
- encrypted twice. If the encryption is done in software, this may create an
- unacceptable loss of throughput. Even with hardware assistance there is a
- noticeable slowdown, as no hardware implementations easily support backwards
- encryption, requiring the data to be reversed in software before the second
- pass is possible.
-
- The best solution is probably to use a word-wise scrambler polynomial like the
- one used in SHA. With a block of plaintext P this is:
-
- P[ i ] ^= P[ i-K1 ] ^ P[ i-K2 ]
-
- with suitable values for the constants K1 and K2. If K2 is chosen to be 5 (the
- SHS hash size in 32-bit words) then the initial values of the 5 words (which
- can be thought of as P[ -5 ]...P[ -1 ]) are simply the sectorIV. The value of
- K1 is arbitrary; SFS uses a value of 4.
-
- This technique is used by first setting the initial values of the 5 words to
- the sectorIV. The scrambler function is then run over the entire data block,
- propagating all dependencies to the last 5 words in the block. These last 5
- words are then used as the IV for the CFB encryption of the entire block. In
- this way the encryption IV depends on all the bits in the block, and the
- scrambling does a moderately good job of breaking up statistical patterns in
- the plaintext. No information is lost, so no randomness in the sectorIV is
- misplaced.
-
- This also provides resistance to the selective-modification attack which allows
- an attacker to change selected bits in the last block of a CFB-encrypted
- dataset without damage. By destroying the IV used in the CFB encryption, the
- first block is completely corrupted, which is unlikely to go unnoticed.
-
- To decrypt a dataset encrypted in this manner, the first 5 words of ciphertext
- are shifted into the feedback path, and the remainder of the dataset is
- decrypted in the standard manner. The last 5 decrypted words are then used as
- the IV to decrypt the first encrypted block. Finally, the scrambler is run
- over the recovered plaintext to undo the changes made during the encryption
- scrambling.
-
- The overall en/decryption process used by SFS, in the case of 512-byte sectors
- and 32-bit words (so that each sector contains 128 words), is:
-
- Encryption:
-
- using sectorIV[ 0 ]...sectorIV[ 4 ] as the scrambler IV
- scramble data[ 0 ]...data[ 127 ]
- using data[ 123 ]...data[ 127 ] as the encryption IV
- encrypt data[ 0 ]...data[ 127 ]
-
- Decryption:
-
- using data[ 0 ]...data[ 4 ] as the encryption IV
- decrypt data[ 5 ]...data[ 127 ]
- using data[ 123 ]...data[ 127 ] as the encryption IV
- decrypt data[ 0 ]...data[ 4 ]
- using sectorIV[ 0 ]...sectorIV[ 4 ] as the scrambler IV
- scramble data[ 0 ]...data[ 127 ]
-
- where the scrambling operation is:
-
- data[ i ] ^= data[ i-4 ] ^ data[ i-5 ]
-
- as outlined above. Note that the i-4 th and i-5 th values referred to here
- are the original, scrambled values, not the descrambled values. The easiest
- way to implement this is to cache the last 5 scrambled values and cyclically
- overwrite them as each word in the data buffer is processed, as the sketch
- below does.
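-
- The following sketch implements both passes in C, assuming 32-bit words and
- the layout described above (sectorIV[ 0 ] playing the role of P[ -5 ]); the
- CFB encryption and decryption steps themselves are omitted, only the
- scrambling is shown.
-
-    #include <string.h>
-    #include <stdint.h>
-
-    /* Scramble: data[ i ] ^= data[ i-4 ] ^ data[ i-5 ], where the
-       referenced values are already-scrambled ones and positions before
-       the start of the buffer come from the sectorIV */
-    static void scramble( uint32_t *data, int length,
-                          const uint32_t *sectorIV )
-        {
-        int i;
-
-        for( i = 0; i < length; i++ )
-            {
-            uint32_t p4 = ( i >= 4 ) ? data[ i - 4 ] : sectorIV[ i + 1 ];
-            uint32_t p5 = ( i >= 5 ) ? data[ i - 5 ] : sectorIV[ i ];
-
-            data[ i ] ^= p4 ^ p5;
-            }
-        }
-
-    /* Descramble: the same pass run forwards, except that the i-4 th and
-       i-5 th values must be the original scrambled ones, so the last 5
-       scrambled words are cached and cyclically overwritten */
-    static void descramble( uint32_t *data, int length,
-                            const uint32_t *sectorIV )
-        {
-        uint32_t cache[ 5 ];    /* scrambled values, last 5 positions */
-        int i;
-
-        memcpy( cache, sectorIV, sizeof( cache ) );
-        for( i = 0; i < length; i++ )
-            {
-            uint32_t scrambled = data[ i ];
-
-            /* position i-5 lives in cache[ i % 5 ], i-4 in
-               cache[ ( i + 1 ) % 5 ] */
-            data[ i ] ^= cache[ ( i + 1 ) % 5 ] ^ cache[ i % 5 ];
-            cache[ i % 5 ] = scrambled;     /* replace the oldest entry */
-            }
-        }
-
- Running scramble() followed by descramble() with the same sectorIV returns
- the original data, since each word has the same two scrambled values XOR'd
- into it and back out again.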
-
- Footnote [1]: To be precise, you need some sort of feedback from the end of
- a block on the first encryption pass to the start of the block
- on the next encryption pass. A block can be encrypted forwards
- twice as long as the IV is wrapped back to the start of the
- block for the second encryption pass.
-
-
- A Discussion of the MDC Encryption Algorithm[1]
-
- (A word on notation: The notation {0,1}^k is used to mean the set of all bit
- strings of length k, and {0,1}* means the set of all bit strings, including the
- empty string. Any message can be viewed as a bit string by means of a suitable
- encoding).
-
- The encryption method used by SFS is somewhat unusual, and in some respects is
- similar to Merkle's "Meta Method" for obtaining cryptosystems[2]. The method
- relies on the existence of secure one-way hash functions. A hash function is a
- function that takes as input an arbitrary number of bits and produces a
- fixed-sized output called the "message digest". In other words, hash functions
- have the form
-
- h : {0,1}* --> {0,1}^k for some fixed k,
-
- and the hash of a message M is defined to be h( M ). A secure one-way hash
- function is a hash function with the following properties:
-
- 1. For each message M, it is easy to compute h( M ).
-
- 2. Given M, it is computationally infeasible to compute M' with
- h( M ) = h( M' ) (secure against forgery).
-
- 3. It is computationally infeasible to compute M and M' with
- h( M ) = h( M' ) (secure against collisions).
-
- For a good, but rather technical, discussion of hash functions, see
- "Contemporary Cryptology. The Science of Information Integrity" edited by
- Gustavus Simmons, IEEE Press, 1992 (ISBN 0-87942-277-7).
-
- The terms "easy to compute" and "infeasible to compute" can be given more
- precise definitions, but we'll settle for this informal terminology for now.
- Roughly speaking, "easy to compute" means that it will take a tolerable amount
- of time to compute the answer, even on a rather small machine; "infeasible to
- compute" means that it should take eons to find out a particular result, even
- when using millions of computers of the fastest conceivable technology in
- parallel.
-
- Examples of hash functions include the MD2, MD4, and MD5 hash functions,
- developed by Ron Rivest of RSA Data Security, Inc., which have been (at least
- in the case of MD4 and MD5) placed in the public domain, and the Secure Hash
- Standard SHS, developed by NIST (with significant input from the NSA). The
- existence of secure one-way hash functions has not been proven, although there
- exist some strong candidates, including MD5 and SHS.
-
- The reference implementations of the above hashing functions include one
- interesting aspect which makes it possible to use them as encryption functions.
- Since the hashing of a very large amount of data in one sweep is not desirable
- (because all the data would have to be in memory at the time of hashing), most
- hashing algorithms allow data to be hashed incrementally. This is made
- possible by augmenting the definition of a hash function to include the state
- of the last hashing operation. In other words, a hash function now has the
- form
-
- h : {0,1}^k x {0,1}* --> {0,1}^k,
-
- where the first argument is the previous hash value, and the hash of a message
- M = ( M1, M2, ..., Mn ) is defined to be
-
- h( h( ...( h( h0, M1 ), M2 ), ... ), Mn ).
-
- (The value of all the h evaluations must not change if the message is broken up
- into blocks of different lengths, but all of the previously mentioned hash
- functions have that property). Here, h0 is a fixed, known initial value that
- is used in all hashing calculations.
-
- This is not the way "real" hash functions behave, but it is close enough. For
- example, the MD5 hashing function has "initialization", "updating", and
- "finalization" parts, where the finalization part appends the number of hashed
- bytes to the message, hashes one final time, and returns the final hash value.
- This means that the hashing "context" must include the number of bytes hashed
- so far, without it being a part of the hash value. The hash function can be
- said to have "memory".
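-
- The following toy sketch illustrates this init/update/final style of
- interface (the same shape as the MD5 reference code), with the byte count
- carried in the context as the "memory" described above. The mixing step is
- a placeholder, not a real hash function.
-
-    #include <stdio.h>
-    #include <stdint.h>
-    #include <stddef.h>
-
-    typedef struct {
-        uint32_t h;         /* running hash value, h0 at the start */
-        uint32_t count;     /* bytes hashed so far - the "memory" */
-        } HASH_CTX;
-
-    static void hashInit( HASH_CTX *ctx )
-        {
-        ctx->h = 0x67452301UL;  /* fixed, known initial value h0 */
-        ctx->count = 0;
-        }
-
-    /* h = h( h, Mi ) for each successive chunk Mi */
-    static void hashUpdate( HASH_CTX *ctx, const uint8_t *data,
-                            size_t length )
-        {
-        size_t i;
-
-        for( i = 0; i < length; i++ )
-            {
-            ctx->h = ( ctx->h ^ data[ i ] ) * 16777619UL;   /* toy mixing */
-            ctx->count++;
-            }
-        }
-
-    /* Fold the byte count in and hash one final time */
-    static uint32_t hashFinal( HASH_CTX *ctx )
-        {
-        uint8_t length[ 4 ];
-
-        length[ 0 ] = ( uint8_t ) ( ctx->count >> 24 );
-        length[ 1 ] = ( uint8_t ) ( ctx->count >> 16 );
-        length[ 2 ] = ( uint8_t ) ( ctx->count >> 8 );
-        length[ 3 ] = ( uint8_t ) ctx->count;
-        hashUpdate( ctx, length, 4 );
-        return( ctx->h );
-        }
-
-    int main( void )
-        {
-        HASH_CTX ctx;
-
-        hashInit( &ctx );
-        hashUpdate( &ctx, ( const uint8_t * ) "Mes", 3 );   /* M1 */
-        hashUpdate( &ctx, ( const uint8_t * ) "sage", 4 );  /* M2 */
-        printf( "%08lX\n", ( unsigned long ) hashFinal( &ctx ) );
-        return( 0 );
-        }
-
- Because the hashing here proceeds byte by byte, splitting the message into
- chunks of different lengths gives the same result, which is exactly the
- property required above.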
-
- If we assume that h is a secure one-way hashing function, we can now use such
- an h as a scrambling device. For example, if we set E( M ) = h( h0, M ) for
- every message M, M will not be recoverable from E( M ), because h is secure by
- definition. Another method would be to supply M to any standard MSDOS or UNIX
- utility and use the resulting error message as the ciphertext (remembering that
- a computer is a device for turning input into error messages). However, there
- are still two problems to be solved before we can use hash functions as
- encryption functions:
-
- 1. The scrambling process is not controlled by a key.
-
- 2. The scrambling process is not invertible, so there is no way to
- decrypt the ciphertext.
-
- Both problems can be solved by interchanging the roles of hash and data and by
- using CFB mode in the encryption process. In other words, let K be an
- arbitrarily long key, let M = ( M1, ..., Mn ) be a message, broken up into
- chunks of k bits, let IV be an initialization vector, and set
-
- C1 = M1 xor h( IV, K )
- Ci = Mi xor h( C( i-1 ), K ) for 1 < i <= n.
-
- This is sent to the recipient, who easily recovers the plaintext by
-
- P1 = C1 xor h( IV, K )
- Pi = Ci xor h( C( i-1 ), K ) for 1 < i <= n,
-
- since we have
-
- P1 = ( M1 xor h( IV, K ) ) xor h( IV, K )
- = M1 xor ( h( IV, K ) xor h( IV,K ) ) because xor is associative,
- = M1 xor 0, because x xor x = 0,
- = M1, because x xor 0 = x,
-
- and similarly for the Pi's. This method of encryption also offers more
- security than using ECB mode, assuming that this were possible with hash
- functions, since the plaintext is diffused over the entire ciphertext,
- destroying plaintext statistics, and thus foiling straightforward ciphertext
- correlation attacks.
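-
- A compact sketch of this construction is given below. The toyHash() routine
- stands in for h( C( i-1 ), K ) - in real MDC it would be SHS and k would be
- 160 bits - and, as in any CFB cipher, the same routine both encrypts and
- decrypts, with only the feedback value differing (the ciphertext always
- enters the feedback path).
-
-    #include <stdio.h>
-    #include <string.h>
-    #include <stdint.h>
-    #include <stddef.h>
-
-    #define K   4   /* chunk size in bytes; 20 (160 bits) for MDC/SHS */
-
-    /* Toy stand-in for h( previousChunk, key ) */
-    static void toyHash( uint8_t *out, const uint8_t *previous,
-                         const uint8_t *key, size_t keyLength )
-        {
-        uint32_t h = 0x811C9DC5UL;
-        size_t i;
-
-        for( i = 0; i < K; i++ )
-            h = ( h ^ previous[ i ] ) * 16777619UL;
-        for( i = 0; i < keyLength; i++ )
-            h = ( h ^ key[ i ] ) * 16777619UL;
-        for( i = 0; i < K; i++ )
-            {
-            out[ i ] = ( uint8_t ) h;
-            h = h * 69069UL + 1;
-            }
-        }
-
-    static void mdcCFB( uint8_t *buffer, size_t length, const uint8_t *key,
-                        size_t keyLength, const uint8_t *iv, int decrypt )
-        {
-        uint8_t feedback[ K ], keystream[ K ];
-        size_t i, j;
-
-        memcpy( feedback, iv, K );
-        for( i = 0; i < length; i += K )
-            {
-            toyHash( keystream, feedback, key, keyLength );
-            for( j = 0; j < K && i + j < length; j++ )
-                {
-                uint8_t c = buffer[ i + j ] ^ keystream[ j ];
-
-                /* the ciphertext always goes into the feedback path */
-                feedback[ j ] = decrypt ? buffer[ i + j ] : c;
-                buffer[ i + j ] = c;
-                }
-            }
-        }
-
-    int main( void )
-        {
-        uint8_t iv[ K ] = { 0 }, message[] = "The magic words are...";
-
-        mdcCFB( message, sizeof( message ),
-                ( const uint8_t * ) "key", 3, iv, 0 );
-        mdcCFB( message, sizeof( message ),
-                ( const uint8_t * ) "key", 3, iv, 1 );
-        printf( "%s\n", message );  /* round-trips to the plaintext */
-        return( 0 );
-        }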
-
- This method can clearly be used for any hash function which can hash
- incrementally. Thus, it is a "Meta Method" for turning hash functions into
- encryption functions. This is called the Message Digest Cipher (MDC) method of
- encryption. Specific instances of the method have the name of the hash
- function added as a suffix. For example, the MDC method applied to the MD5
- hash function would be referred to as MDC/MD5. SFS uses MDC/SHS.
-
- Having analysed the inner workings of MDC, we should mention at least one
- theoretical attack on the algorithm. There are certain properties of hash
- functions which may make them unsuitable for use as encryption algorithms. For
- example suppose knowledge of a 160-bit input/output pair to SHS leaks a
- fraction of a bit of information about the data being hashed, maybe a quarter
- of a bit. This allows a search of 2^159.75 data blocks to find another data
- block that causes the given input-output transformation, and thus finds a
- second message which produces the same hash value. This problem is not
- significant when SHS is used as a cryptographic hash function, since it only
- reduces the search space by 16% from the full 2^160 possibilities. However
- when SHS is used for encryption, it may be possible to accumulate these quarter
- bits, so that after 2560 blocks (50K) of known plaintext, enough bits have been
- accumulated to compute the encryption key (2560 blocks at a quarter of a bit
- each gives 640 bits, more than the 512-bit key). This is because multiple
- input/output pairs are available for a given data block, and each one puts
- more constraints on the block until eventually the actual value can be
- determined.
-
- If a hash function has the properties given above and no such information is
- leaked, it can serve to create a strong encryption algorithm, but a serious
- weakness in the encryption algorithm is not necessarily a serious weakness in
- the hash function. To date no one has demonstrated such a weakness, and
- there are a number of similar "what if" arguments which can be used against
- most encryption schemes. For example if it were possible to build a quantum
- computer then it could be used to break many of the most popular public-key
- encryption schemes in use today. The reason that these schemes aren't being
- abandoned is that it is very unlikely that any computer of this form will be
- built, and that if someone does manage it then the implications will be far
- more serious than just the breaking of a number of encryption schemes.
-
- Footnote [1]: Most of this analysis was contributed by Stephan Neuhaus,
- <neuhaus@informatik.uni-kl.de>
-
- Footnote [2]: This is discussed further in Ralph Merkle's paper "One Way Hash
- Functions and DES", Crypto '89 Proceedings, Springer-Verlag,
- 1989 (volume 435 of the Lecture Notes in Computer Science
- series).
-
-
- Smart cards and SFS
-
- Due to their sealed, self-contained nature, smart cards provide a reasonable
- amount of security against many forms of attack such as the trojan horse
- variety with which it is so easy to compromise security on a more
- general-purpose computer such as a PC. However in the last few years a large
- amount of expertise in attacking smart-card based systems has been gathered by
- people who target pay-TV systems which rely on smart-card based processors for
- security. Pay-TV pirates have been very successful in picking apart supposedly
- secure processors in order to clone them or defeat their encryption.
- Typically, this might be done by heating the device to 150 C and then
- dissolving it in fuming nitric acid. An alternative is to use sulphuric acid
- at 300 C. By using a special fixture which produces a small jet via capillary
- action, the part of the package over the chip die can be removed.
-
- Once the packaging has been removed, only the bond wires are left to hold the
- leads to the die so the IC will fall apart, leaving the die available for
- analysis. If done correctly, this disassembly of the device will allow the
- part to remain functional. One generation of the Sky TV smart cards was
- reverse-engineered using this technique to get to the chips and a scanning
- electron microscope to read out the contents of the individual ROM cells.
-
- Another commonly-used technique is to selectively erase the security fuses used
- to protect some types of secure microprocessor, allowing the contents of the
- ROM to be read out and disassembled. For example, the security bits in the
- 8751 microcontroller aren't located anywhere near the EPROM in the device, and
- can be selectively erased without affecting the main memory contents, which are
- now unprotected and can be read out. Another way to bypass the security bits
- in this device is to manipulate the voltages and control signals fed to it,
- which causes it to become confused about whether it should be accessing the
- internal, protected memory, or external, unprotected memory. By using code
- located in the external memory it is possible to access the contents of the
- internal memory. This technique has been used to recover the memory contents
- of an 87C51 on a smart card with all security features enabled.
-
- Sometimes it is possible to find an address on the card which contains an
- opcode corresponding to a jump to code stored in external memory, so that by
- changing the external program, it is possible to seize control of the card
- (this method was used to hack the Futuretron video scrambling system). It is
- occasionally possible to gradually erase the security fuse in one-time
- programmable and EPROM-programmable secure processors by slowly increasing or
- decreasing the supply voltage fed to the chip. Although this generally erases
- or damages the data on the chip as well, it is possible to recover the contents
- by combining information read from several chips.
-
- A much more sophisticated way to reverse-engineer chips uses thermal neutron
- imaging to perform a non-invasive analysis of the device. Intense pulsed
- neutron source (IPNS) thermal neutrons can be used to image crystal structures
- of many angstroms, and are a non-ionizing form of radiation which can penetrate
- several meters of material. By setting the pulses to pass through cold crystal
- structures, coherent waves of neutrons can be generated and various layers of a
- chip can be focused on and analyzed by moving either the chip or the detector.
- Although the electrical nature of the structure such as the values of any
- stored charges can't be seen this way, it is possible to duplicate the
- structure with a non-invasive method and then experiment with the voltages in
- the duplicate.
-
- This method of analysis requires vast resources, but there are many countries
- with the necessary nuclear reactors, and creating thermal coherent neutron
- waves isn't excessively difficult. The method is also very difficult to
- protect against.
-
- Attacks of this sort, which involve physical analysis of the card contents and
- which rely for their success on the ability to read out the internal circuitry
- and memory contents of the card, apply only to cards using ROM or EPROM
- technology, in which (typically) a memory cell is implemented as a fuse which
- is blown when the card is programmed. The blown and non-blown fuses can then
- be read out using one of the techniques described above, and the memory
- contents reconstructed.
-
- In contrast, the cards used with SFS use EEPROM memory cells rather than EPROM
- ones. In an EEPROM, data is stored as the presence or absence of an
- electrical charge, which means that the contents of a memory cell cannot be
- determined by a physical examination as they can for a conventional ROM or
- EPROM cell. An attack therefore requires disassembling the package and probing
- the contents of each memory cell electrically, which requires significantly
- more resources than an examination under an electron microscope.
-
- As an additional security precaution, SFS encrypts all critical data stored on
- the card in the same way that it encrypts data stored in the SFS volume header.
-
- In normal operation, each SFS disk volume is encrypted with a unique disk key
- which is completely unrelated to the user passphrase, or key. When the
- user key is entered, it is used to decrypt the disk key, and the disk key is
- then used to decrypt the encrypted volume. There is no correlation between the
- user key and the disk key, so that revealing the disk key does not reveal the
- user key. This access mechanism looks as follows:
-
- + User - - - + + Encrypted volume - - - - - - - - - - - - +
-
- | +--------+ | decrypt | +--------+ decrypt +--------------+ |
- |User Key| -----------> |Disk Key| -----------> |Encrypted Data|
- | +--------+ | | +--------+ +--------------+ |
-
- + - - - - - - + + - - - - - - - - - - - - - - - - - - - - - +
-
- When used with a smart card, the user key is instead used to decrypt a key
- stored in the smart card which is in turn used to decrypt the disk key:
-
- + User - - - + + Smart card +
-
- | +--------+ | decrypt | +--------+ | decrypt
- |User Key| -----------> |Card Key| ----------+
- | +--------+ | | +--------+ | |
- |
- + - - - - - - + + - - - - - - + |
- |
- +-------------------------------+
- |
- | + Encrypted volume - - - - - - - - - - - - +
- |
- | | +--------+ decrypt +--------------+ |
- +-----> |Disk Key| -----------> |Encrypted Data|
- | +--------+ +--------------+ |
-
- + - - - - - - - - - - - - - - - - - - - - - +
-
- Since the password is no longer used to directly decrypt the disk key to access
- the encrypted volume, knowledge of the user password or key, or any attack
- based on the user password or key is of no use unless the attacker is also in
- possession of the smart card containing the card key. Since a single card key
- can be used to decrypt multiple disk keys, it is possible for one card to
- control access to multiple encrypted volumes. Finally, since each card key can
- also contain extra information such as access rights to an encrypted volume, it
- is possible for a central authority to issue cards which control the type of
- access allowed for the volume, such as cards which only grant read access to
- their holders.
-
- In order to convert a volume from direct user access to smart-card access, it
- is necessary to first initialise the card for use with SFS by generating a
- random card key and encrypting it with the user key, and then to bind the card
- to a disk volume by encrypting the disk key with the card key. The first,
- initialization step is performed as follows:
-
- Generate a random card key
- Encrypt the card key with the user key for the card
- Write the encrypted card key to the card
-
- Card key
- |
- v
- User key +---------+
- for --> | Encrypt |
- card +---------+
- |
- v
- Smart card
-
- To recover the card key, the user key for the card is used to decrypt the key
- stored on the card.
-
- The second, card binding step is performed as follows:
-
- Decrypt the disk key with the user key for the disk
- Decrypt the card key with the user key for the card
- Encrypt the disk key with the card key
- Write the encrypted disk key to the disk
-
- Smart card Disk volume
- | |
- v v
- User key +---------+ User key +---------+
- for --> | Decrypt | for --> | Decrypt |
- card +---------+ disk +---------+
- | |
- Card key Disk key
- | |
- | +-------------+
- | |
- | v
- | +---------+
- +--------> | Encrypt |
- +---------+
- |
- v
- Disk volume
-
- To recover the card key, the user key for the card is used to decrypt the key
- stored on the card. To recover the disk key, the decrypted card key obtained
- in the previous step is used to decrypt the key stored on the disk.
-
- To access a volume using the smart card, the following process is used:
-
- Decrypt the card key with the user key for the card
- Decrypt the disk key with the card key
- Access the volume using the disk key
-
-                 Smart card
-                      |
-                      v
-     User key    +---------+
-       for ----> | Decrypt |
-       card      +---------+
-                      |
-                  Card key                 Disk volume
-                      |                         |
-                      |                         v
-                      |                     +---------+
-                      +-------------------> | Decrypt |
-                                            +---------+
-                                                |
-                                                v
-                                             Disk key
-
- Access to the disk volume is now possible only to someone in possession of the
- smart card containing the card key. Since multiple disk keys can be encrypted
- with the same card key, one card can control access to a number of disk
- volumes. Since the card key and disk key are completely unrelated, acquiring
- the smart card containing the encrypted card key without the user key needed to
- decrypt it does not provide an attacker with access to the disk volume.
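-
- The three card operations can be summarised in the following sketch. The
- encrypt()/decrypt() pair stands in for MDC encryption under the given key,
- and the card and disk I/O routines are hypothetical placeholders; the point
- is the key hierarchy, with the disk key wrapped by the card key and the
- card key wrapped by the user key.
-
-    #include <stdint.h>
-
-    #define KEY_SIZE    64      /* 512-bit keys */
-
-    extern void encrypt( uint8_t *data, int length, const uint8_t *key );
-    extern void decrypt( uint8_t *data, int length, const uint8_t *key );
-    extern void getRandom( uint8_t *buffer, int length );
-    extern void cardRead( uint8_t *key );
-    extern void cardWrite( const uint8_t *key );
-    extern void diskReadKey( uint8_t *key );
-    extern void diskWriteKey( const uint8_t *key );
-
-    /* Initialization: generate a random card key, encrypted under the
-       user key for the card */
-    static void cardInit( const uint8_t *cardUserKey )
-        {
-        uint8_t cardKey[ KEY_SIZE ];
-
-        getRandom( cardKey, KEY_SIZE );
-        encrypt( cardKey, KEY_SIZE, cardUserKey );
-        cardWrite( cardKey );
-        }
-
-    /* Binding: re-encrypt the disk key from the disk user key to the
-       card key */
-    static void cardBind( const uint8_t *diskUserKey,
-                          const uint8_t *cardUserKey )
-        {
-        uint8_t cardKey[ KEY_SIZE ], diskKey[ KEY_SIZE ];
-
-        cardRead( cardKey );
-        decrypt( cardKey, KEY_SIZE, cardUserKey );
-        diskReadKey( diskKey );
-        decrypt( diskKey, KEY_SIZE, diskUserKey );
-        encrypt( diskKey, KEY_SIZE, cardKey );
-        diskWriteKey( diskKey );
-        }
-
-    /* Access: recover the disk key via the card key */
-    static void cardAccess( const uint8_t *cardUserKey, uint8_t *diskKey )
-        {
-        uint8_t cardKey[ KEY_SIZE ];
-
-        cardRead( cardKey );
-        decrypt( cardKey, KEY_SIZE, cardUserKey );
-        diskReadKey( diskKey );
-        decrypt( diskKey, KEY_SIZE, cardKey );
-        }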
-
-
- Deletion of SFS Volumes
-
- Truly deleting data from magnetic media is very difficult. The problem lies in
- the fact that when data is written to the medium, the write head sets the
- polarity of most, but not all, of the magnetic domains. This is partially due
- to the inability of the writing device to write in exactly the same location
- each time, and partially due to the variations in media sensitivity and field
- strength over time and among devices.
-
- In general terms, when a one is written to disk, the media records a one, and
- when a zero is written, the media records a zero. However the actual effect is
- closer to obtaining something like 0.95 when a zero is overwritten with a one,
- and 1.05 when a one is overwritten with a one. Normal disk circuitry is set up
- so that both these values are read as ones, but using specialized circuitry it
- is possible to work out what previous `layers' contained (in fact on some
- systems it may be possible to recover previous data with a simple software
- modification to the hardware control code. Some drives are capable of
- performing "offset reads" to the side of the normal track centres (for example
- using SCSI diagnostic commands), so that if the original data was written to
- one side, the overwritten data can be recovered by reading from the other side
- and applying error-recovery techniques).
-
- This problem is further complicated by the fact that the heads might not pass
- exactly over the same track position when data is rewritten, leaving a trace of
- the old data still intact. Current-generation drives reduce this problem
- somewhat as track and linear densities have now become so high that the
- traditional optical methods of extracting information from the disk platters
- has become much more difficult, and in some cases impossible, as the linear bit
- cell is below the optical diffraction limit for visible light. While some data
- patterns can still be discerned, recognizing others would be limited to some
- subset of patterns. However the recovery of at least one or two layers of
- overwritten data isn't too hard to perform by replacing the drive electronics
- (but not the read/write head), reading off the "obvious" signal, computing what
- it would look like if written to blank media, subtracting that from what was
- actually read, and recovering the overwritten data from the difference, a
- technique which has occasionally been used by hard drive manufacturers to
- recover accidentally overwritten or lost data.
-
- Despite this small respite, when all the above factors are combined it turns
- out that each track on a piece of media contains an image of everything ever
- written to it, but that the contribution from each `layer' gets progressively
- smaller the further back it was made. Using even more advanced techniques than
- those mentioned above, like low energy electron scattering, ferrofluid with
- optical tracers, or related methods, followed by the same signal-processing
- technology which is used to clean up satellite images, low-level recorded
- speech, and the like, it is possible to recover layers of previous data with a
- remarkable degree of accuracy, to a level limited only by the sensitivity of
- the equipment and the amount of expertise of the organisation attempting the
- recovery. Intelligence organisations have a *lot* of expertise in this field.
-
- The general concept behind the overwriting scheme used by SFS is to flip each
- magnetic domain on the disk back and forth as much as possible (this is the
- basic idea behind degaussing) without writing the same pattern twice in a row.
- If the data was encoded directly, we could simply choose the desired overwrite
- pattern of ones and zeroes and write it repeatedly. However, disks always use
- an NRZI encoding scheme in which a 1 bit signifies an inversion, making the
- desired pattern a series of one bits. This leads to a further complication as
- all disks use some form of run-length limited (RLL) encoding, so that the
- adjacent ones won't be written. This encoding is used to ensure that
- transitions aren't placed too closely together, or too far apart, which would
- mean the drive would lose track of where it was in the data.
-
- The parameter to be aware of here is the separation of 1 bits. Floppies (which
- are a more primitive form of the technology used in hard disks) like to keep
- the 1 bits 4us apart. However they can't be kept too far apart or the read
- clock loses synchronisation. This "too far" figure depends a lot on the
- technology in the drive, it doesn't depend on the magnetic media much (unlike
- the "too close" figure, which depends a lot on the media involved). The first
- single-density encoding wrote one user data bit, then one "1" clock bit, taking
- a total of 8us. This was called FM, since a 1 bit was encoded as two
- transitions (1 wavelength) per 8 us, while a 0 bit was encoded as one
- transition (1/2 wavelength).
-
- Then it was discovered that it was possible to have two 0 bits between adjacent
- 1s. The resulting encoding of 4 bits into 5 was called group code recording
- (GCR) which was (0,2) RLL. This allowed 4 user data bits to be written in
- 5 * 4us = 20 us, for an effective time per user bit of 5 us, which was a big
- improvement over 8 us.
-
- But GCR encoding was somewhat complex. A different approach was taken in
- modified FM (MFM), which suppressed the 1 clock bit except between adjacent
- 0's. Thus, 0000 turned into 0(1)0(1)0(1)0 (where the ()s are the inserted
- clock bits), 1111 turned into 1(0)1(0)1(0)1, and 1010 turned into
- 1(0)0(0)1(0)0. The maximum time between 1 bits was now three 0 bits. However,
- there was at least one 0 bit, so it became possible to clock the bits at
- 2us/bit and not have more than one 1 bit every 4us. This achieved one user bit
- per 4 us, a result which was better than GCR and obtainable with simpler
- circuitry. As can be seen, the effective data rate depends on the bit rate
- (which has been 4 us, 4 us and 2 us in these examples) and the encoding rate, a
- measure of the encoding efficiency. The encoding rates have been 1/2, 4/5 and
- 1/2.
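-
- The MFM clock-bit rule is simple enough to show directly. The toy encoder
- below prints the interior clock bits in parentheses, as in the examples
- above: a clock 1 is inserted between two adjacent 0 data bits, and a clock
- 0 otherwise.
-
-    #include <stdio.h>
-
-    /* Print the MFM encoding of an array of data bits */
-    static void mfmEncode( const int *bits, int noBits )
-        {
-        int i, previous = 1;
-
-        for( i = 0; i < noBits; i++ )
-            {
-            if( i > 0 )     /* clock = 1 only between adjacent 0's */
-                printf( "(%d)", ( !previous && !bits[ i ] ) ? 1 : 0 );
-            printf( "%d", bits[ i ] );
-            previous = bits[ i ];
-            }
-        putchar( '\n' );
-        }
-
-    int main( void )
-        {
-        static const int zeroes[] = { 0, 0, 0, 0 };
-        static const int ones[] = { 1, 1, 1, 1 };
-        static const int alternating[] = { 1, 0, 1, 0 };
-
-        mfmEncode( zeroes, 4 );         /* 0(1)0(1)0(1)0 */
-        mfmEncode( ones, 4 );           /* 1(0)1(0)1(0)1 */
-        mfmEncode( alternating, 4 );    /* 1(0)0(0)1(0)0 */
-        return( 0 );
-        }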
-
- There is also a (2,7) RLL code with rate 1/2, meaning that 1 user bit maps to 2
- encoded bits (although the mapping involves multi-bit groups and is somewhat
- complex), but there are always at least two 0 bits between 1 bits, so 1 bits
- happen at most every 3 bit times. This allows the clock to be reduced to
- 1.3333 us (this 2/1.33333 = 6/4 = 3/2 is the source of the added capacity gain
- of RLL hard drive controllers over equivalent MFM controllers). The time per
- user bit is now 2.6666 = 2 2/3 us.
-
- However, the faster clock is finicky. It is also possible to use a (1,7) RLL
- encoding. Since this is (1,x), the clock rate is 2 us/bit, but the encoding
- efficiency improves from 1/2 to 2/3. This allows 2 effective user bits per 6
- us, or 3 us per user bit. For hard drives, it is easier to increase the clock
- rate by an extra 9/8 than to fiddle with a clock 50% faster, so this is very
- popular with more recent disk drives.
-
- To erase magnetic media, we need to overwrite it many times with alternating
- patterns in order to expose it to a magnetic field oscillating fast enough that
- it does the desired flipping of the magnetic domains in a reasonable amount of
- time. Unfortunately, there is a complication in that we need to saturate the
- disk surface to the greatest depth possible, and very high frequency signals
- only "scratch the surface" of the magnetic medium. Disk drive manufacturers,
- in trying to achieve ever-higher densities, use the highest possible
- frequencies, whereas we really require the lowest frequency a disk drive can
- produce. Even this is still rather high. In fact, a full erasure of the disk
- is generally impossible to perform, as can be seen by considering the manner in
- which servo information is encoded.
-
- All disk drives need what is called "servo" information indicating where the
- head is relative to the centre of the track it is trying to read. Older drives
- used one of two techniques, the first of which was the "separate servo"
- technique in which one surface of the stack of platters was reserved for servo
- information used to control the stack of arms for the other platters. This
- gave a high update rate for the servo information, but didn't allow for any
- adjustment for differential expansion of the platters or arms, or tilting of
- the stack of arms. This multiplexing in space of the servo information was
- limited by the accuracy to which two heads could be aligned in the same stack.
-
- The other technique was the "embedded servo" technique in which the servo
- information was placed on the same platter as the data, between the data blocks
- in the header area. This meant that the same head was used to read the servo
- information and disk data, but led to gaps while reading and writing the data.
- This technique multiplexes the servo information in time.
-
- A newer technique is called "servo-under" (also known as "dedicated servo"),
- and writes the servo information "under" the data, at a lower frequency. The
- normal data write frequencies are high enough, and the write signal weak
- enough, that it never penetrates to the bottom of the magnetic media and so
- the servo is unaffected by writing to the disk (in fact, even if the head was
- writing DC it probably wouldn't be able to generate a strong enough signal to
- penetrate to the servo data). A couple of filters separate the servo
- information from the data, so that the servo information can be accessed from
- the same head and at the same time as the data is read. This technique is very
- convenient, but shows that the media can never be completely saturated, since
- doing this would erase the servo information, making the disk useless.
-
- However, what we can do is to use the lowest frequency possible for overwrites,
- to penetrate as deeply as possible into the recording medium. To do this, we
- must determine what decoded data to write to produce a low-frequency encoded
- signal.
-
- The three most common RLL codes are (1,3) RLL (usually known as MFM), (2,7) RLL
- (the original "RLL" format), and (1,7) RLL (which is popular in newer drives).
- These three cover the vast majority of magnetic disk drives. However, each of
- these has several possible variants. With MFM, only one is used with any
- frequency, but the newest (1,7) RLL code has at least half a dozen variants in
- use. (1,3) RLL and (2,7) RLL have coding rates of 1/2, meaning that one
- "useful" decoded bit requires two encoded bits. (1,7) RLL has a rate of 2/3,
- so two decoded bits correspond to three encoded bits.
-
- For MFM, there are at most 4 bit times between transitions, so the lowest
- frequency possible is "4T", attained by the repeating decoded data patterns
- 1010 and 0101. These have a 1 bit every other "data" bit, and the intervening
- "clock" bits are all 0. To complete the set, we would like patterns with every
- other clock bit set to 1 and all others set to 0, but these are not possible in
- the MFM encoding (such "violations" are used to generate special marks on the
- disk to identify sector boundaries).
-
- The best we can do is 3 bit times between transitions, a 3T frequency, which is
- generated by repeating the decoded patterns 100100, 010010 and 001001. We
- should use several rounds of these patterns, as MFM drives are the oldest,
- lowest-density drives around (this is especially true for the very-low-density
- floppy drives). As such, they are the easiest to recover data from with modern
- equipment and we need to take the most care with them.
-
- For (1,7) RLL, although there can be as many as 8 bit times between
- transitions, the lowest sustained frequency we can have is 6T, 6 bit times
- between transitions. This is a desirable property from the point of view of
- the clock-recovery circuitry, and all (1,7) RLL codes seem to have this
- property. We now need to find a way to write the desired pattern without
- knowing the particular (1,7) RLL code used. We can do this by looking at the
- way the drive's error-correction system works. The error-correction is applied
- to the decoded data, even though errors generally occur in the encoded data.
- In order to make this work well, the data encoding should have limited error
- amplification, so that an erroneous encoded bit should affect only a small,
- finite number of decoded bits.
-
- Decoded bits therefore depend only on nearby encoded bits, so that a repeating
- pattern of encoded bits will correspond to a repeating pattern of decoded bits.
- The repeating pattern of encoded bits is 6 bits long. Since the rate of the
- code is 2/3, this corresponds to a repeating pattern of 4 decoded bits. There
- are only 16 possibilities for this pattern, making it feasible to write all of
- them during the erase process. So to achieve good overwriting of (1,7) RLL
- disks, we write the patterns 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111,
- 1000, 1001, 1010, 1011, 1100, 1101, 1110, and 1111. These patterns also
- conveniently cover two of the ones needed for MFM overwrites, although we
- should add a few more iterations of the MFM-specific patterns for the reasons
- given above.
-
- For (2,7) RLL, we again have 6T as the lowest sustainable frequency, which
- maps, at an encoding rate of 1/2, to some repeating pattern of 3 decoded bits,
- so we require all 8 possible 3-bit repeating patterns. The all-zeros and
- all-ones patterns overlap with the (1,7) RLL patterns, leaving six others:
-
- 001001001001001001001001
- 2 4 9 2 4 9
-
- in binary or 0x24 0x92 0x49, 0x92 0x49 0x24 and 0x49 0x24 0x92 in hex, and
-
- 011011011011011011011011
- 6 D B 6 D B
-
- in binary or 0x6D 0xB6 0xDB, 0xB6 0xDB 0x6D and 0xDB 0x6D 0xB6 in hex. The
- first three are the same as the MFM patterns, so we need only three extra
- patterns to cover (2,7) RLL drives.
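-
- Since these (2,7) RLL overwrite patterns repeat with a 3-byte period, a
- buffer can't simply be filled with a single byte value as for the (1,7) RLL
- patterns. A fill routine might look like the following sketch (hypothetical
- code, not taken from the SFS source):
-
-    #include <stddef.h>
-
-    /* Fill a buffer with a repeating 3-byte pattern such as
-       0x6D 0xB6 0xDB by cycling through the three bytes */
-    static void fillPattern( unsigned char *buffer, size_t length,
-                             const unsigned char pattern[ 3 ] )
-        {
-        size_t i;
-
-        for( i = 0; i < length; i++ )
-            buffer[ i ] = pattern[ i % 3 ];
-        }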
-
- Although (1,7) is more popular in recent drives, some hard drives do still use
- (2,7) RLL. These three patterns also cover any endianness issues, which
- weren't a concern in the previous two cases but would be in this one
- (in practice, thanks to the strong influence of IBM mainframe drives,
- everything seems to be uniformly big-endian within bytes, with the most
- significant bit being written to the disk first).
-
- For the latest crop of high-density drives which use Partial-Response
- Maximum-Likelihood (PRML) methods (roughly comparable to the trellis coding
- done by V.32 modems, in that they are effective but computationally
- expensive), all we can do is write a variety of random patterns, because the
- processing inside the drive is too complex to second-guess.
- Fortunately, these drives push the limits of the magnetic media much more than
- the old MFM drives ever did by encoding data with much smaller magnetic
- domains, closer to the physical capacity of the magnetic media. If these
- drives require sophisticated signal processing just to read the most recently
- written data, reading overwritten layers is also correspondingly more
- difficult. A good scrubbing with random data will do about as well as can be
- expected.
-
- We now have a set of 22 overwrite patterns which will erase everything,
- regardless of the raw encoding. The basic disk eraser can be improved slightly
- by adding random passes before, after, and during the erase process, and by
- performing the deterministic passes in random order to make it more difficult
- to guess which of the known data passes were made at which point.
-
- To deal with all this in one overwrite pattern, SFS uses the following sequence
- of 35 consecutive writes to erase data:
-
- 1. Random
- 2. Random
- 3. Random
- 4. Random
- 5. 01010101 01010101 01010101 0x55 (1,7) RLL MFM
- 6. 10101010 10101010 10101010 0xAA (1,7) RLL MFM
- 7. 10010010 01001001 00100100 0x92 0x49 0x24 (2,7) RLL MFM
- 8. 01001001 00100100 10010010 0x49 0x24 0x92 (2,7) RLL MFM
- 9. 00100100 10010010 01001001 0x24 0x92 0x49 (2,7) RLL MFM
- 10. 00000000 00000000 00000000 0x00 (1,7) RLL (2,7) RLL
- 11. 00010001 00010001 00010001 0x11 (1,7) RLL
- 12. 00100010 00100010 00100010 0x22 (1,7) RLL
- 13. 00110011 00110011 00110011 0x33 (1,7) RLL
- 14. 01000100 01000100 01000100 0x44 (1,7) RLL
- 15. 01010101 01010101 01010101 0x55 (1,7) RLL MFM
- 16. 01100110 01100110 01100110 0x66 (1,7) RLL
- 17. 01110111 01110111 01110111 0x77 (1,7) RLL
- 18. 10001000 10001000 10001000 0x88 (1,7) RLL
- 19. 10011001 10011001 10011001 0x99 (1,7) RLL
- 20. 10101010 10101010 10101010 0xAA (1,7) RLL MFM
- 21. 10111011 10111011 10111011 0xBB (1,7) RLL
- 22. 11001100 11001100 11001100 0xCC (1,7) RLL
- 23. 11011101 11011101 11011101 0xDD (1,7) RLL
- 24. 11101110 11101110 11101110 0xEE (1,7) RLL
- 25. 11111111 11111111 11111111 0xFF (1,7) RLL (2,7) RLL
- 26. 10010010 01001001 00100100 0x92 0x49 0x24 (2,7) RLL MFM
- 27. 01001001 00100100 10010010 0x49 0x24 0x92 (2,7) RLL MFM
- 28. 00100100 10010010 01001001 0x24 0x92 0x49 (2,7) RLL MFM
- 29. 01101101 10110110 11011011 0x6D 0xB6 0xDB (2,7) RLL
- 30. 10110110 11011011 01101101 0xB6 0xDB 0x6D (2,7) RLL
- 31. 11011011 01101101 10110110 0xDB 0x6D 0xB6 (2,7) RLL
- 32. Random
- 33. Random
- 34. Random
- 35. Random
-
- The MFM-specific patterns are repeated twice because MFM drives are generally
- the lowest density, and are thus particularly easy to examine. The
- deterministic patterns between the random writes are permuted before the write
- is performed, to make it more difficult for an opponent to use knowledge of
- the erasure data written to recover overwritten data (in fact a
- cryptographically strong random number generator must be used to perform the
- permutations, to stop an opponent who can read the last overwrite pass from
- predicting the previous passes and "echo cancelling" them by subtracting the
- known overwrite data).
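-
- The permutation can be sketched as follows (illustrative code only, not the
- actual SFS implementation; the hypothetical function strongRandom() stands
- in for the cryptographically strong random number generator mentioned
- above). The 27 deterministic passes, numbers 5...31 in the list above, are
- shuffled with a Fisher-Yates shuffle:
-
-    #define NO_DETERMINISTIC_PASSES     27
-
-    extern int strongRandom( int range );   /* CSPRNG value in 0...range-1 */
-
-    static void shufflePasses( int passOrder[ NO_DETERMINISTIC_PASSES ] )
-        {
-        int i, j, temp;
-
-        /* Start with the passes in their listed order */
-        for( i = 0; i < NO_DETERMINISTIC_PASSES; i++ )
-            passOrder[ i ] = i;
-
-        /* Fisher-Yates shuffle: swap each element with a randomly chosen
-           earlier (or identical) position */
-        for( i = NO_DETERMINISTIC_PASSES - 1; i > 0; i-- )
-            {
-            j = strongRandom( i + 1 );      /* 0 <= j <= i */
-            temp = passOrder[ i ];
-            passOrder[ i ] = passOrder[ j ];
-            passOrder[ j ] = temp;
-            }
-        }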
-
- If the device being written to supports caching or buffering of data, SFS will
- additionally attempt to disable the buffering to ensure physical disk writes
- are performed (for example by setting the Force Unit Access bit during SCSI-2
- Group 1 write commands).
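-
- For illustration, in a SCSI-2 Group 1 write (WRITE(10), opcode 0x2A) the
- Force Unit Access bit is bit 3 of byte 1 of the command descriptor block.
- The following sketch (illustrative only; the OS-specific pass-through
- mechanism needed to actually issue the command is not shown) builds such a
- CDB:
-
-    /* Build a WRITE(10) CDB with the FUA bit set so that the drive must
-       write the data to the media before completing the command */
-    static void buildWriteCDB( unsigned char cdb[ 10 ],
-                               unsigned long lba, unsigned int blocks )
-        {
-        cdb[ 0 ] = 0x2A;                /* WRITE(10) opcode */
-        cdb[ 1 ] = 0x08;                /* FUA = bit 3 of byte 1 */
-        cdb[ 2 ] = ( unsigned char ) ( lba >> 24 );     /* Logical block */
-        cdb[ 3 ] = ( unsigned char ) ( lba >> 16 );     /* address,      */
-        cdb[ 4 ] = ( unsigned char ) ( lba >> 8 );      /* big-endian    */
-        cdb[ 5 ] = ( unsigned char ) lba;
-        cdb[ 6 ] = 0;                   /* Reserved */
-        cdb[ 7 ] = ( unsigned char ) ( blocks >> 8 );   /* Transfer length */
-        cdb[ 8 ] = ( unsigned char ) blocks;
-        cdb[ 9 ] = 0;                   /* Control byte */
-        }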
-
- There is a commonly-held belief that there is a US government standard for
- declassifying magnetic media which simply involves overwriting data on it three
- times. There are in fact a number of standards[1] which contain simple phrases
- such as "Magnetic disks, drums, and other similar rigid storage devices shall
- be sanitized by overwriting all storage locations with any character, then the
- complement of the character (e.g., binary ones and binary zeros) alternately a
- minimum of three times". However this simple description is usually
- reinterpreted by the appropriate government agencies to a level which often
- makes physical destruction of the media and its replacement with new media
- preferable to attempting any declassification by overwriting the data (the
- (classified) standards for truly declassifying magnetic media probably involve
- concentrated acid, furnaces, belt sanders, or any combination of the above[2]).
- One unclassified standard which contains information on data erasure is ANSI
- X9.17-1985 "Financial Key Management (Wholesale)", Appendix E, "Erasing
- (Zeroizing) Recording Media Used for Storage of Keying Material". Section E.5,
- "Disk Pack" contains three possible alternatives: "(a) Apply an emery wheel or
- sander to the recording surface of an inoperative disk. Ensure that the entire
- surface is completely removed prior to disposal. (b) The resin binder and
- ferric oxide surface can be completely removed/stripped (chemically destroyed)
- from the disk with concentrated hydrochloric acid (55-58%). (c) Melting".
- This applies only to non-classified data; the sanitizing of media containing
- classified information is generally handled in a far more paranoid manner.
-
- The use of such extreme measures is necessary not only because data recovery
- from the actual tracks themselves may (with some effort) be possible, but because
- of factors like intelligent drives which keep so-called "alternative cylinders"
- on a disk free to allow dynamic re-allocation of data to one of these tracks in
- case a disk block develops errors. The original block is no longer accessible
- through any sort of normal access mechanism, and the data on it can't be
- destroyed without going through some unusual contortions which tend to be
- highly hardware-dependent. Other complications include the use of journaling
- filesystems which keep track of previous generations of data, and disk
- compression software or hardware which will compress a simple repeated
- overwrite pattern to nothing and leave the original data intact on the disk[3].
- Therefore if "overwriting all storage locations" is interpreted to mean
- "exposing the entire reading surface to a magnetic field having a strength at
- the recording surface greater than the field intensity at which it was
- recorded", the method does have merit. Unfortunately it is virtually
- impossible to get at all storage locations, and simple-minded methods such as
- trying to write patterns to the storage allocated to a file in order to erase
- it don't even come close to this target. The overwrite method used by SFS does
- come reasonably close by attempting to create a fluctuating magnetic field over
- all physically accessible sectors which make up a disk volume, creating the
- same effect as a degaussing tool used to erase magnetic fields.
-
- One possibility for lessening the impact of encrypted data being retained on a
- disk long after it should have vanished is to frequently change an encryption
- parameter so that the encrypted data also changes when the encryption parameter
- is changed. For the case of SFS disk keys, all this would require is that at
- certain intervals (for example each time the volume is mounted) the disk key is
- re-encrypted with a different random initialization vector and written back to
- the disk so that the key never has time to get "burned in" to the media.
- However this still leaves the problem of the last-written key being present,
- and is probably not worth the effort involved in constantly updating the key.
-
- Another consideration which needs to be taken into account when trying to erase
- data through software is that drives conforming to some of the higher-level
- protocols such as the various SCSI standards are relatively free to interpret
- commands sent to them in whichever way they choose (as long as they still
- conform to the SCSI specification). Thus some drives, if sent a FORMAT UNIT
- command, may return immediately without performing any action, may simply
- perform a read test on the entire disk (the most common option), or may
- actually write data to the disk[4][5]. This is rather common among newer
- drives which can't directly be low-level formatted, unlike older ST-412 and
- ST-506 MFM or RLL drives. For example trying to format an IDE drive generally
- has little effect - a low-level format generally isn't possible, and the
- standard DOS `format' command simply writes a boot record, FAT, and root
- directory, performs a quick scan for bad sectors, and exits.
-
- Therefore if the data is very sensitive and is stored on floppy disk, it can
- best be destroyed by removing the media from the disk liner and burning it.
- Disks are relatively cheap, and can easily be replaced. Permanently destroying
- data on fixed disks is less simple, but the multiple-overwrite option used by
- SFS at least makes it quite challenging (and expensive) to recover any
- information.
-
- Footnote [1]: Among others there is the Department of Defense standard DoD
- 5200.28-M, Army standard AR 380-19, Navy standards OPNAVINST
- 5510.1H and NAVSO P-5239-26, Air Force standard AFSSI-5020, and
- Department of Energy standard DOE 5637.1.
-
- Footnote [2]: The UK Ministry of Defence grinds down disk platters and then
- supposedly stores the (still-classified) dust for a decade or
- more. Rumours that they remove programmers' brains for storage in
- order to erase the classified information they contain are
- probably unfounded.
-
- Footnote [3]: From a posting to the usenet alt.security newsgroup on 1 August
- 1994, message-ID <31c75s$pa8@search01.news.aol.com>: "I got fired
- from my job and was told to clean my desk, so I immediately went
- to my office and ran Norton WipeDisk on my hard drive, which
- contained much of the work I had done and also my contact list,
- letters, games, and so on. Unfortunately, I had DoubleSpaced it
- and the files were easily recovered".
-
- Footnote [4]: Again it may be possible to bypass this using highly hardware-
- specific methods. For example Quantum SCSI drives manufactured a
- few years ago could be forced to write data to disk during a
- format by changing the sector filler byte before the format
- command was issued.
-
- Footnote [5]: The SCSI-2 standard includes an initialization pattern (IP)
- option for the FORMAT UNIT command (Section 8.2.1.2), however it
- is not clear how well this is supported by existing drives.
-
-
- Safeguarding Cryptographic Keys
-
- A threshold scheme divides a message (in this case the key to be protected)
- into `n' pieces, or shares, so that any `m' of these shares can be used to
- reconstruct the key, but `m-1' or less cannot reconstruct it. This is called
- an (m,n) threshold scheme.
-
- A simple all-or-nothing scheme would break a key into `n' parts such that the
- key could be recovered by taking the exclusive-or of these parts. However this
- method allows no margin of error, since even a single missing share will
- destroy the ability to recreate the key. This method allows for a limited form
- of key safeguarding, but is not a true threshold scheme.
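-
- Such an exclusive-or split might look like the following sketch (assumed
- code, not taken from SFS; the hypothetical getStrongRandom() stands in for
- a cryptographically strong random number source). XORing all `n' shares
- together recovers the key, while any missing share makes recovery
- impossible:
-
-    #include <stddef.h>
-
-    extern void getStrongRandom( unsigned char *buffer, size_t length );
-
-    /* Split a key into n shares such that the XOR of all n shares
-       reconstructs the key */
-    static void xorSplit( const unsigned char *key, size_t keyLen,
-                          unsigned char **shares, int n )
-        {
-        size_t i;
-        int j;
-
-        /* Start the final share off as a copy of the key... */
-        for( i = 0; i < keyLen; i++ )
-            shares[ n - 1 ][ i ] = key[ i ];
-
-        /* ...then make n-1 random shares, XORing each into the final one */
-        for( j = 0; j < n - 1; j++ )
-            {
-            getStrongRandom( shares[ j ], keyLen );
-            for( i = 0; i < keyLen; i++ )
-                shares[ n - 1 ][ i ] ^= shares[ j ][ i ];
-            }
-        }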
-
- SFS uses a true threshold scheme, namely the one presented in "How to Share a
- Secret" by Adi Shamir, Communications of the ACM, Vol.22, No.11 (November
- 1979), p.612. This involves choosing a prime `p' which is larger than the
- number of shares required, and an arbitrary polynomial of degree `m-1', where
- `m' is the number of shares necessary to reconstruct the secret. To distribute
- the secret data, we generate a polynomial:
-
- ax^(m-1) + bx^(m-2) ... + cx + M (mod p)
-
- where `a' ... `c' are random secret coefficients which are discarded once the
- data has been distributed, `p' is a prime number larger than any of the
- coefficients, and `M' is the secret to be distributed. For example if we
- wanted to create a (3,n) threshold scheme in which three shares out of the
- total number would be necessary to reconstruct the secret data, we would
- generate the quadratic polynomial:
-
- ax^2 + bx + M (mod p)
-
- The shares themselves are obtained by evaluating the polynomial at `n'
- different points, where `n' is the total number of shares distributed. In this
- case for our polynomial f() we evaluate it at x = 1, x = 2, ... x = n, and
- distribute the resulting f( 1 ), f( 2 ), ... f( n ) values as the share data.
- Since the polynomial has `m' unknown coefficients a, b, ... c and M, any `m'
- shares can be used to create `m' equations, after which linear algebra can be
- used to solve for M. `m-1' shares can't recreate the secret. More than `m'
- shares are redundant.
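-
- Share generation can be sketched as follows (hypothetical code, not the SFS
- implementation, which works in GF( 2^8 ) as described below rather than
- modulo a prime; strongRandom() again stands in for a strong random number
- source):
-
-    extern int strongRandom( int range );   /* CSPRNG value in 0...range-1 */
-
-    /* Generate n shares f( 1 )...f( n ) of an (m,n) threshold scheme by
-       evaluating a random degree m-1 polynomial with constant term
-       `secret', using Horner's rule. Assumes m <= 16 and that all values
-       fit comfortably in an int */
-    static void makeShares( int secret, int m, int n, int p, int shares[] )
-        {
-        int coeff[ 16 ], i, x, y;
-
-        coeff[ 0 ] = secret;            /* Constant term is the secret */
-        for( i = 1; i < m; i++ )
-            coeff[ i ] = strongRandom( p ); /* Random secret coefficients */
-
-        for( x = 1; x <= n; x++ )
-            {
-            for( y = 0, i = m - 1; i >= 0; i-- )
-                y = ( y * x + coeff[ i ] ) % p;
-            shares[ x - 1 ] = y;        /* Share x is the pair ( x, f(x) ) */
-            }
-        }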
-
- For example, suppose we wish to create a (3,5) threshold scheme in which any 3
- of a total of 5 shares are sufficient to reconstruct the secret M. Assuming
- M = 5 and taking two random coefficients a = 4 and b = 6 and a prime p = 13, we
- have:
-
- f(x) = 4x^2 + 6x + 5 (mod 13)
-
- Evaluating this at x = 1...5 gives:
-
- f(1) = 4 + 6 + 5 (mod 13)
- = 2
- f(2) = 16 + 12 + 5 (mod 13)
- = 7
- f(3) = 36 + 18 + 5 (mod 13)
- = 7
- f(4) = 64 + 24 + 5 (mod 13)
- = 2
- f(5) = 100 + 30 + 5 (mod 13)
- = 5
-
- To reconstruct `M' from three of these shares (for example share 1, 3, and 5),
- we need to solve the set of linear equations:
-
- 2 = a.1^2 + b.1 + M (mod 13)
- 7 = a.3^2 + b.3 + M (mod 13)
- 5 = a.5^2 + b.5 + M (mod 13).
-
- We can do this using Lagrange interpolation to recover the values a = 4, b = 6,
- and M = 5, giving us the original secret.
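-
- As a check of the arithmetic, the following sketch (illustrative code only)
- reconstructs the secret from shares 1, 3, and 5 of the example above by
- Lagrange interpolation at x = 0, with division implemented as
- multiplication by the modular inverse:
-
-    #include <stdio.h>
-
-    /* Modular inverse via Fermat's little theorem: a^(p-2) mod p for
-       prime p, computed by square-and-multiply */
-    static int modInverse( int a, int p )
-        {
-        int result = 1, exponent = p - 2;
-
-        a = ( ( a % p ) + p ) % p;      /* Handle negative values */
-        while( exponent > 0 )
-            {
-            if( exponent & 1 )
-                result = ( result * a ) % p;
-            a = ( a * a ) % p;
-            exponent >>= 1;
-            }
-        return( result );
-        }
-
-    int main( void )
-        {
-        static const int x[ 3 ] = { 1, 3, 5 }, y[ 3 ] = { 2, 7, 5 };
-        const int p = 13;
-        int secret = 0, i, j;
-
-        for( i = 0; i < 3; i++ )
-            {
-            int coeff = 1;
-
-            /* Lagrange basis coefficient evaluated at x = 0 */
-            for( j = 0; j < 3; j++ )
-                if( i != j )
-                    coeff = ( coeff * x[ j ] % p ) *
-                            modInverse( x[ j ] - x[ i ], p ) % p;
-            secret = ( secret + y[ i ] * coeff ) % p;
-            }
-        printf( "Recovered secret = %d\n", secret );    /* Prints 5 */
-        return( 0 );
-        }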
-
- A single share therefore consists of an X coordinate (one of 1, 2, ... n) and
- the corresponding Y coordinate f( X ); for a secret longer than one field
- element, the share carries one Y value per element, all evaluated at the same
- X. The total set of shares is a set of X coordinates X[] and a corresponding
- set of Y coordinates Y[].
-
- To reconstruct the secret, we first calculate the coefficients C[]:
-
- for( i = 0; i < m; i++ )
-     {
-     C[ i ] = 1;
-     for( j = 0; j < m; j++ )
-         if( i != j )
-             /* The division here is field division, as described below */
-             C[ i ] *= X[ j ] / ( X[ i ] - X[ j ] );
-     }
-
- Once we have the coefficients, we can reconstruct the secret from the rest of
- the share, namely the Y coordinates:
-
- secret = 0;
- for( i = 0; i < m; i++ )
- secret += C[ i ] * Y[ i ];
-
- To make the secret-sharing task easier, we use a finite field for all our work.
- An exact explanation of the theory involved is beyond the scope of this
- document; anyone interested in more background details should refer to the
- references in the "Recommended Reading" section below. It suffices to note
- that the secret sharing scheme used by SFS works in GF( 2^n ), in which
- addition, subtraction, multiplication, and division are all well defined.
- There is an additive identity 0, a multiplicative identity 1, every nonzero
- number has a unique inverse, and the commutative, associative, and distributive
- laws all hold. Galois fields are very convenient to work with since they keep
- the numbers involved to a finite size, and there are no rounding errors in
- division.
-
- All arithmetic is done modulo an irreducible polynomial of degree `n'. The
- characteristics of the Galois fields used by SFS are as follows:
-
- n Max.no of shares Generator polynomial
-
- 4 15 x^4 + x + 1
- 5 31 x^5 + x^2 + 1
- 6 63 x^6 + x + 1
- 7 127 x^7 + x + 1
- 8 255 x^8 + x^4 + x^3 + x^2 + 1
- 9 511 x^9 + x^4 + 1
- 10 1023 x^10 + x^3 + 1
- 11 2047 x^11 + x^2 + 1
- 12 4095 x^12 + x^6 + x^4 + x + 1
- 13 8191 x^13 + x^4 + x^3 + x + 1
- 14 16383 x^14 + x^5 + x^3 + x + 1
- 15 32767 x^15 + x + 1
- 16 65535 x^16 + x^5 + x^3 + x^2 + 1
-
- Although SFS could use any of GF( 2^4 ) ... GF( 2^16 ), the current
- implementation is restricted to GF( 2^8 ) to allow data to be processed in
- byte-sized quantities.
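-
- For reference, multiplication in GF( 2^8 ) modulo the generator polynomial
- from the table above, x^8 + x^4 + x^3 + x^2 + 1 (0x11D as a bit mask), can
- be sketched as follows; addition and subtraction in GF( 2^n ) are both just
- XOR. This is an illustrative implementation, not the actual SFS code:
-
-    /* Multiply two elements of GF( 2^8 ), reducing modulo the field
-       polynomial whenever the intermediate result overflows 8 bits */
-    static unsigned char gfMul( unsigned char a, unsigned char b )
-        {
-        unsigned int product = 0, aVal = a;
-        int i;
-
-        for( i = 0; i < 8; i++ )
-            {
-            if( b & 1 )
-                product ^= aVal;        /* Add in this multiple of a */
-            b >>= 1;
-            aVal <<= 1;                 /* Multiply a by x */
-            if( aVal & 0x100 )
-                aVal ^= 0x11D;          /* Reduce modulo the polynomial */
-            }
-        return( ( unsigned char ) product );
-        }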
-
- The use of a threshold scheme can be extended to allow shareholders to be
- replaced or added at a later point using the techniques given in David Chaum's
- paper "How to Keep a Secret Alive", presented during Crypto'84 and appearing in
- volume 196 of the Lecture Notes in Computer Science published by
- Springer-Verlag (ISBN 3-540-15658-5 and 0-387-15658-5). The replacement of
- a shareholder requires that a shareholder break his share into subshares and
- distribute the subshares to the other shareholders, allowing them to recreate
- his share if it is lost or destroyed. The creation of a new shareholder
- requires the generation of additional shares which are again broken into
- subshares and distributed to shareholders. In either case a quorum of
- shareholders can use their subshares to vote on whether a shareholder should be
- replaced, or a new shareholder created. SFS does not currently support this
- extended key safeguarding due to the complexity involved in its use.
-